Embedding Heterogeneous Data by Preserving Multiple Kernels

Author

  • Mehmet Gönen
Abstract

Heterogeneous data may arise in many real-life applications under different scenarios. In this paper, we formulate a general framework to address the problem of modeling heterogeneous data. Our main contribution is a novel embedding method, called multiple kernel preserving embedding (MKPE), which projects heterogeneous data into a unified embedding space by preserving cross-domain interactions and within-domain similarities simultaneously. These interactions and similarities between data points are approximated with Gaussian kernels to transfer local neighborhood information to the projected subspace. We also extend our method to out-of-sample embedding using a parametric formulation in the projection step. The performance of MKPE is illustrated on two tasks: (i) modeling biological interaction networks and (ii) cross-domain information retrieval. Empirical results on these two tasks validate the predictive performance of our algorithm.
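
The idea of approximating given similarities with Gaussian kernels in the embedding space can be illustrated with a minimal sketch. The squared-error loss, the shared bandwidth `sigma`, and all function and variable names below are assumptions made for illustration; the abstract does not state the exact objective, so this should not be read as the paper's formulation.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def preservation_loss(Ex, Ez, Kx, Kz, Kxz, sigma=1.0):
    """Hypothetical loss: squared gap between the target kernels and the
    Gaussian kernels computed from the embeddings Ex and Ez.
    Kx, Kz are within-domain similarities; Kxz holds cross-domain interactions."""
    loss = np.sum((gaussian_kernel(Ex, Ex, sigma) - Kx) ** 2)
    loss += np.sum((gaussian_kernel(Ez, Ez, sigma) - Kz) ** 2)
    loss += np.sum((gaussian_kernel(Ex, Ez, sigma) - Kxz) ** 2)
    return loss

# Toy data: build the target kernels from known 2-D embeddings of two domains.
rng = np.random.default_rng(0)
Ex_true = rng.normal(size=(5, 2))
Ez_true = rng.normal(size=(4, 2))
Kx = gaussian_kernel(Ex_true, Ex_true)
Kz = gaussian_kernel(Ez_true, Ez_true)
Kxz = gaussian_kernel(Ex_true, Ez_true)

# The generating embeddings reproduce the targets; random ones do not.
loss_true = preservation_loss(Ex_true, Ez_true, Kx, Kz, Kxz)
loss_rand = preservation_loss(rng.normal(size=(5, 2)), rng.normal(size=(4, 2)), Kx, Kz, Kxz)
```

In this toy setup an embedding method would search for `Ex` and `Ez` that drive such a loss down; the optimizer itself is left out of the sketch.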

Similar resources

Generalized Similarity Kernels for Efficient Sequence Classification

String kernel-based machine learning methods have yielded great success in practical tasks of structured/sequential data analysis. In this paper we propose a novel computational framework that uses general similarity metrics and distance-preserving embeddings with string kernels to improve sequence classification. An embedding step, a distance-preserving bitstring mapping, is used to effectivel...

Efficient Optimization for Low-Rank Integrated Bilinear Classifiers

In pattern classification, it is necessary to treat two-way data (feature matrices) efficiently while preserving the two-way structure, such as spatio-temporal relationships. The classifier for the feature matrix is generally formulated by multiple bilinear forms which result in a matrix. The rank of the matrix, i.e., the number of bilinear forms, should be low from the viewpoint of generalizat...

AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks

Heterogeneous information networks (HINs) are ubiquitous in real-world applications. Due to the heterogeneity in HINs, the typed edges may not fully align with each other. In order to capture the semantic subtlety, we propose the concept of aspects with each aspect being a unit representing one underlying semantic facet. Meanwhile, network embedding has emerged as a powerful method for learning...

A Geometry Preserving Kernel over Riemannian Manifolds

The kernel trick and projection to tangent spaces are two choices for linearizing the data points lying on Riemannian manifolds. These approaches are used to provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to high dimensional feature space without considering the intrinsic geometry of data points. ...

Scalable Alignment Kernels via Space-Efficient Feature Maps

String kernels are attractive data analysis tools for analyzing string data. Among them, alignment kernels are known for their high prediction accuracies in string classifications when tested in combination with SVMs in various applications. However, alignment kernels have a crucial drawback in that they scale poorly due to their quadratic computation complexity in the number of input strings, ...

Publication date: 2014